YOLOSA: Object detection based on 2D local feature superimposed self-attention
Abstract
We analyzed the network structure of real-time object detection models and found that the features in the feature concatenation stage are very rich. Applying an attention module here can effectively improve the accuracy of the model. However, commonly used attention or self-attention modules perform poorly in terms of inference efficiency. Therefore, we propose a novel module, called 2D local feature superimposed self-attention, for the neck network. This module reflects global features through local receptive fields. We also optimize an efficient decoupled head and AB-OTA, and achieve SOTA results. Average precisions of 49.0% (71 FPS, 14 ms), 46.1% (85 FPS, 11.7 ms), and 39.1% (107 FPS, 9.3 ms) were obtained for the large, medium, and small-scale models built using our proposed improvements. Our models exceeded YOLOv5 by 0.8% to 3.1% in average precision.
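The throughput and latency figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that latency is simply the reciprocal of throughput (one image per forward pass), which the abstract does not state explicitly; the names `reported`, `latency_ms`, and `implied` are illustrative only.

```python
# (FPS, latency in ms) pairs as quoted in the abstract, keyed by model scale.
reported = {
    "large": (71, 14.0),
    "medium": (85, 11.7),
    "small": (107, 9.3),
}


def latency_ms(fps: float) -> float:
    """Per-image latency in ms implied by a throughput in FPS.

    Assumption: one image per forward pass, so latency = 1000 / FPS.
    """
    return 1000.0 / fps


# Latency implied by each reported FPS value.
implied = {scale: latency_ms(fps) for scale, (fps, _) in reported.items()}

for scale, (fps, ms) in reported.items():
    print(f"{scale}: {fps} FPS -> {implied[scale]:.2f} ms implied, {ms} ms reported")
```

Under that assumption, each implied latency lands within 0.1 ms of the quoted value, so the FPS and millisecond figures describe the same measurement rather than two separate ones.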
Similar resources
Object Recognition based on local feature trajectories
This paper presents a novel approach for extracting discriminative descriptions of 3-D objects using spatio-temporal information. In particular, local features are tracked in image sequences leading to local trajectories containing dynamic information. These trajectories are judged with respect to their quality and robustness and finally each of them is assigned a single local descriptor from a...
Feature extraction: Face detection techniques and 3D object recognition based on local feature extraction
Feature extraction and representation is an integral part of multimedia processing. How to extract ideal features that reflect the intrinsic content of images as completely as possible is still a very challenging problem in computer science. However, very little progress has been made toward a solution in recent decades. So in this paper, we focus ou...
Object localization in 2D images based on Kohonen's self-organization feature maps
This paper presents a hybrid approach for neural object localization and recognition in 2D grey level images. The system combines an auto-associative network, two self-organization feature maps (SOMs), and a three-layer feed-forward network trained with dynamic learning vector quantization (DLVQ). By using a hidden layer smaller than the input/output layers, the auto-associative network can be...
Object Recognition based on Local Steering Kernel and SVM
The proposed method recognizes objects by applying Local Steering Kernels (LSK) as descriptors to image patches. To represent the local properties of an image, patches are extracted where variations occur. A wavelet-based salient point detector is used to find the interest points. The Local Steering Kernel is then applied to the resultant pixels, in ...
Journal
Journal title: Pattern Recognition Letters
Year: 2023
ISSN: 1872-7344, 0167-8655
DOI: https://doi.org/10.1016/j.patrec.2023.03.003